704 research outputs found

    A note on optimal sampling strategy for structural variant detection using optical mapping

    Structural variants compose the majority of human genetic variation, but are difficult to assess accurately using current genomic sequencing technologies. Optical mapping technologies, which measure the size of chromosomal fragments between labeled markers, offer an alternative approach. As these technologies mature toward becoming clinical tools, there is a need for an approach to determine the optimal strategy for sampling biological material in order to detect a structural variant at some threshold. Here we develop an optimization approach using a simple, yet realistic, model of the genomic mapping process based on a hypergeometric distribution and probabilistic concentration inequalities. Our approach is both computationally and analytically tractable and includes a novel approach to obtaining tail bounds for the hypergeometric distribution. We show that if a genomic mapping technology can sample most of the chromosomal fragments within a sample, comparatively little biological material is needed to detect a variant at high confidence.
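
    As a rough illustration of the kind of hypergeometric model the abstract describes, the sketch below computes the exact probability of observing at least a threshold number of variant-carrying fragments when fragments are drawn without replacement. It is not the authors' method: the function name `prob_at_least` and the parameters `N` (fragments in the sample), `K` (fragments carrying the variant), `n` (fragments actually mapped) and `t` (detection threshold) are assumptions made for this example, and the paper works with tail bounds rather than this exact sum.

```python
from math import comb


def prob_at_least(N: int, K: int, n: int, t: int) -> float:
    """Exact P(X >= t) for X ~ Hypergeometric(N, K, n).

    Hypothetical parameter names (not from the paper):
      N: total fragments in the biological sample
      K: fragments that carry the structural variant
      n: fragments sampled (mapped) without replacement
      t: minimum number of variant fragments needed to call the variant
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    lo = max(t, 0, n - (N - K))  # smallest feasible count that is >= t
    hi = min(K, n)               # largest feasible count
    total = comb(N, n)
    favourable = sum(comb(K, k) * comb(N - K, n - k) for k in range(lo, hi + 1))
    return favourable / total


if __name__ == "__main__":
    # Illustrative numbers only: 1000 fragments, 20 carry the variant.
    for n in (100, 500, 900):
        print(n, prob_at_least(1000, 20, n, 10))
```

    With these made-up numbers, the detection probability rises as the mapped fraction `n / N` grows, which is the qualitative behaviour the abstract reports.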

    Inferential models: A framework for prior-free posterior probabilistic inference

    Posterior probabilistic statistical inference without priors is an important but so far elusive goal. Fisher's fiducial inference, Dempster-Shafer theory of belief functions, and Bayesian inference with default priors are attempts to achieve this goal but, to date, none has given a completely satisfactory picture. This paper presents a new framework for probabilistic inference, based on inferential models (IMs), which not only provides data-dependent probabilistic measures of uncertainty about the unknown parameter, but does so with an automatic long-run frequency calibration property. The key to this new approach is the identification of an unobservable auxiliary variable associated with observable data and unknown parameter, and the prediction of this auxiliary variable with a random set before conditioning on data. Here we present a three-step IM construction, and prove a frequency-calibration property of the IM's belief function under mild conditions. A corresponding optimality theory is developed, which helps to resolve the non-uniqueness issue. Several examples are presented to illustrate this new approach.
    Comment: 29 pages with 3 figures. Main text is the same as the published version. Appendix B is an addition, not in the published version, that contains some corrections and extensions of two of the main theorems.

    Phase-stabilized UV light at 267 nm through twofold second harmonic generation

    Providing phase-stable laser light is important to extend the interrogation time of optical clocks towards many seconds and thus achieve small statistical uncertainties. We report a laser system providing more than 50 µW of phase-stabilized UV light at 267.4 nm for an aluminium ion optical clock. The light is generated by frequency-quadrupling a fibre laser at 1069.6 nm in two cascaded non-linear crystals, both in single-pass configuration. In the first stage, a 10 mm long PPLN waveguide crystal converts 1 W of fundamental light to more than 0.2 W at 534.8 nm. In the following 50 mm long DKDP crystal, more than 50 µW of light at 267.4 nm is generated. An upper limit for the passive short-term phase stability has been measured by a beat-note measurement with an existing phase-stabilized quadrupling system employing the same source laser. The resulting fractional frequency instability of less than 5 × 10⁻¹⁷ after 1 s supports lifetime-limited probing of the ²⁷Al⁺ clock transition, given a sufficiently stable laser source. A further improved stability of the fourth-harmonic light is expected through interferometric path length stabilisation of the pump light by back-reflecting it through the entire setup and correcting for frequency deviations. The in-loop error signal indicates an electronically limited instability of 1 × 10⁻¹⁸ at 1 s.
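
    The wavelengths quoted in the abstract follow from the fact that each second-harmonic stage doubles the optical frequency and therefore halves the wavelength. The short sketch below checks that arithmetic; the function name `harmonic_wavelength_nm` is made up for this example, and the efficiency figure is only a lower bound derived from the "more than 0.2 W from 1 W" statement.

```python
def harmonic_wavelength_nm(fundamental_nm: float, order: int) -> float:
    """Wavelength of the order-th harmonic: frequency scales by `order`,
    so wavelength scales by 1 / order."""
    if order < 1:
        raise ValueError("harmonic order must be >= 1")
    return fundamental_nm / order


if __name__ == "__main__":
    fundamental = 1069.6  # nm, fibre laser wavelength from the abstract
    second = harmonic_wavelength_nm(fundamental, 2)  # after the PPLN stage
    fourth = harmonic_wavelength_nm(fundamental, 4)  # after the DKDP stage
    print(f"second harmonic: {second:.1f} nm")
    print(f"fourth harmonic: {fourth:.1f} nm")

    # Lower bound on first-stage conversion efficiency (> 0.2 W out of 1 W in).
    print(f"first-stage efficiency: more than {0.2 / 1.0:.0%}")
```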

    Joint and individual analysis of breast cancer histologic images and genomic covariates

    A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little overlap between these two fields. We aim to bridge this gap by developing methods based on Angle-based Joint and Individual Variation Explained (AJIVE) to directly explore similarities and differences between these two modalities. Our approach exploits Convolutional Neural Networks (CNNs) as a powerful, automatic method for image feature extraction to address some of the challenges presented by statistical analysis of histopathology image data. CNNs raise issues of interpretability that we address by developing novel methods to explore visual modes of variation captured by statistical algorithms (e.g. PCA or AJIVE) applied to CNN features. Our results provide many interpretable connections and contrasts between histopathology and genetics.

    Coherent frequentism

    By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. The closure of the set of expected losses corresponding to the dual frequentist posteriors constrains decisions without arbitrarily forcing optimization under all circumstances. This decision theory reduces to those that maximize expected utility when the pair of frequentist posteriors is induced by an exact or approximate confidence set estimator or when an automatic reduction rule is applied to the pair. In such cases, the resulting frequentist posterior is coherent in the sense that, as a probability distribution of the parameter of interest, it satisfies the axioms of the decision-theoretic and logic-theoretic systems typically cited in support of the Bayesian posterior. Unlike the p-value, the confidence level of an interval hypothesis derived from such a measure is suitable as an estimator of the indicator of hypothesis truth since it converges in sample-space probability to 1 if the hypothesis is true or to 0 otherwise under general conditions.
    Comment: The confidence-measure theory of inference and decision is explicitly extended to vector parameters of interest. The derivation of upper and lower confidence levels from valid and nonconservative set estimators is formalized.

    Molecular species identification of Central European ground beetles (Coleoptera: Carabidae) using nuclear rDNA expansion segments and DNA barcodes

    Background: The identification of vast numbers of unknown organisms using DNA sequences is becoming increasingly important in ecological and biodiversity studies. In this context, a fragment of the mitochondrial cytochrome c oxidase I (COI) gene has been proposed as the standard DNA barcoding marker for the identification of organisms. Limitations of the COI barcoding approach can arise from its single-locus identification system, the effect of introgression events, incomplete lineage sorting, numts, heteroplasmy and maternal inheritance of intracellular endosymbionts. Consequently, the analysis of a supplementary nuclear marker system could be advantageous.
    Results: We tested the effectiveness of the COI barcoding region and of three nuclear ribosomal expansion segments in discriminating ground beetles of Central Europe, a diverse and well-studied invertebrate taxon. As nuclear markers we determined the 18S rDNA: V4, 18S rDNA: V7 and 28S rDNA: D3 expansion segments for 344 specimens of 75 species. Seventy-three species (97%) of the analysed species could be accurately identified using COI, while the combined approach of all three nuclear markers provided resolution among 71 (95%) of the studied Carabidae.
    Conclusion: Our results confirm that the analysed nuclear ribosomal expansion segments in combination constitute a valuable and efficient supplement for classical DNA barcoding to avoid potential pitfalls when only mitochondrial data are being used. We also demonstrate the high potential of COI barcodes for the identification of even closely related carabid species.